

Model-based Policy Optimization with Unsupervised Model Adaptation

Neural Information Processing Systems

Model-based reinforcement learning methods learn a dynamics model from real data sampled from the environment and leverage it to generate simulated data with which to train an agent. However, a distribution mismatch between the simulated data and the real data can degrade performance. Despite much effort devoted to reducing this mismatch, existing methods fail to address it explicitly. In this paper, we investigate how to bridge the gap between real and simulated data caused by inaccurate model estimation, so as to achieve better policy optimization. We first derive a lower bound on the expected return, which naturally inspires a bound-maximization algorithm that aligns the simulated and real data distributions. To this end, we propose AMPO, a novel model-based reinforcement learning framework that introduces unsupervised model adaptation to minimize the integral probability metric (IPM) between the feature distributions of real and simulated data. Instantiating the framework with the Wasserstein-1 distance yields a practical model-based approach. Empirically, our approach achieves state-of-the-art sample efficiency on a range of continuous control benchmark tasks.
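To make the Wasserstein-1 instantiation concrete, below is a minimal sketch of adversarial feature alignment in PyTorch. The `Critic`, the shared `encoder`, the optimizers, and the WGAN-GP-style gradient penalty used to enforce the 1-Lipschitz constraint are all assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch: align real and simulated feature distributions by
# minimizing an estimated Wasserstein-1 distance (hypothetical setup,
# assuming PyTorch; real and simulated batches have equal sizes).
import torch
import torch.nn as nn

class Critic(nn.Module):
    """Witness function f for the IPM; 1-Lipschitz via gradient penalty."""
    def __init__(self, feat_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return self.net(x)

def gradient_penalty(critic, real_feat, sim_feat):
    # Penalize deviations of the critic's gradient norm from 1 at points
    # interpolated between real and simulated features (WGAN-GP style).
    alpha = torch.rand(real_feat.size(0), 1, device=real_feat.device)
    interp = (alpha * real_feat + (1 - alpha) * sim_feat).requires_grad_(True)
    grad = torch.autograd.grad(critic(interp).sum(), interp, create_graph=True)[0]
    return ((grad.norm(2, dim=1) - 1) ** 2).mean()

def adaptation_step(encoder, critic, critic_opt, enc_opt,
                    real_batch, sim_batch, gp_weight=10.0):
    # 1) Train the critic to tighten the Wasserstein-1 estimate
    #    W1 ~= E[f(real)] - E[f(sim)] over 1-Lipschitz f.
    real_feat = encoder(real_batch).detach()
    sim_feat = encoder(sim_batch).detach()
    w1 = critic(real_feat).mean() - critic(sim_feat).mean()
    critic_loss = -w1 + gp_weight * gradient_penalty(critic, real_feat, sim_feat)
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # 2) Update the encoder to shrink the estimated distance, pulling the
    #    simulated feature distribution toward the real one.
    w1 = critic(encoder(real_batch)).mean() - critic(encoder(sim_batch)).mean()
    enc_opt.zero_grad()
    w1.backward()
    enc_opt.step()
```

In this sketch the critic plays the role of the IPM witness function, and the encoder is updated adversarially; other Lipschitz-enforcement schemes (e.g., weight clipping or spectral normalization) would slot into the same loop.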


Appendix for Model-based Policy Optimization with Unsupervised Model Adaptation: A. Omitted Proofs

Neural Information Processing Systems

Besides the Wasserstein distance, other distribution divergence metrics can be used to align the features. MMD is another instance of the IPM, obtained when the witness function class is the unit ball of a reproducing kernel Hilbert space (RKHS). The results on three environments are shown in Figure 5. Figure D.5 shows the one-step model losses during training in the other four environments; the conclusion of Section 5.2 still holds in these environments.
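As a concrete point of reference for the MMD variant, the squared MMD between two feature batches can be estimated in a few lines. The sketch below assumes PyTorch and a Gaussian RBF kernel with a fixed bandwidth, which is an illustrative choice rather than the paper's exact kernel.

```python
# Minimal sketch: biased estimate of squared MMD between real and simulated
# feature batches with a Gaussian RBF kernel (illustrative bandwidth).
import torch

def rbf_kernel(x, y, bandwidth=1.0):
    # k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)), computed pairwise.
    sq_dists = torch.cdist(x, y) ** 2
    return torch.exp(-sq_dists / (2 * bandwidth ** 2))

def mmd2(real_feat, sim_feat, bandwidth=1.0):
    # MMD^2(P, Q) = E[k(x, x')] - 2 E[k(x, y)] + E[k(y, y')],
    # with x, x' ~ P (real features) and y, y' ~ Q (simulated features).
    k_rr = rbf_kernel(real_feat, real_feat, bandwidth).mean()
    k_ss = rbf_kernel(sim_feat, sim_feat, bandwidth).mean()
    k_rs = rbf_kernel(real_feat, sim_feat, bandwidth).mean()
    return k_rr - 2 * k_rs + k_ss
```

Unlike the Wasserstein-1 instantiation, minimizing `mmd2(encoder(real_batch), encoder(sim_batch))` with respect to the encoder aligns the two feature distributions without training a separate critic network.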

